Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix API issue about missing the status code in the events and logs endpoints #24406

Merged
merged 4 commits into from
Nov 4, 2024

Conversation

Luap99
Copy link
Member

@Luap99 Luap99 commented Oct 29, 2024

see commits

Fixes #23712

Does this PR introduce a user-facing change?

The events and container logs API endpoints correctly send the status code right away now and not wait for the fir event/log line.

Copy link
Contributor

openshift-ci bot commented Oct 29, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Luap99

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 29, 2024
@github-actions github-actions bot added the kind/api-change Change to remote API; merits scrutiny label Oct 29, 2024
@Luap99 Luap99 force-pushed the event-api-response branch from 2edd970 to 9d130c8 Compare October 29, 2024 16:02
@Luap99 Luap99 force-pushed the event-api-response branch from 9d130c8 to 0420921 Compare October 29, 2024 17:08
Using the internal Wait() API over the events API as this is much more
efficient. Reading events will need to read a lot of data otherwise.

For the function here it should work fine and it is even better as it
does not depend on the event logger at all.

Signed-off-by: Paul Holzinger <[email protected]>
This type is unsused, undocumented and basically broken. If this would
be used anywhere it will just deadlock after writing 100+ events without
reading as the channel will just be full.

It was added in commit 8da5f3f but never used there nor is there any
justification why this was added in the commit message or PR comments.

Signed-off-by: Paul Holzinger <[email protected]>
One of the problems with the Events() API was that you had to call it in
a new goroutine. This meant the the error returned by it had to be read
back via a second channel. This cuased other bugs in the past but here
the biggest problem is that basic errors such as invalid since/until
options were not directly returned to the caller.
It meant in the API we were not able to write http code 200 quickly
because we always waited for the first event or error from the
channels. This in turn made some clients not happy as they assume the
server hangs on time out if no such events are generated.

To fix this we resturcture the entire event flow. First we spawn the
goroutine inside the eventer Read() function so not all the callers have
to. Then we can return the basic error quickly without the goroutine.
The caller then checks the error like any normal function and the API
can use this one to decide which status code to return.
Second we now return errors/event in one channel then the callers can
decide to ignore or log them which makes it a bit more clear.

Fixes c46884a ("podman events: check for an error after we finish reading events")
Fixes containers#23712

Signed-off-by: Paul Holzinger <[email protected]>
API clients expect the status code quickly otherwise they can time out.
If we do not flush we may not write the header immediately and only when
futher logs are send.

Fixes containers#23712

Signed-off-by: Paul Holzinger <[email protected]>
@Luap99 Luap99 force-pushed the event-api-response branch from 0420921 to e6d9878 Compare November 1, 2024 17:54
@Luap99
Copy link
Member Author

Luap99 commented Nov 4, 2024

@containers/podman-maintainers PTAL

Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive-by comments, I only looked at the small commits and I don’t understand the codebase.

func (e EventJournalD) Read(ctx context.Context, options ReadOptions) error {
defer close(options.EventChannel)
func (e EventJournalD) Read(ctx context.Context, options ReadOptions) (retErr error) {
runtime.LockOSThread()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it obvious why this is necessary? If not, a comment would be nice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah I think I can drop this gain. It is not doing anything AFAICT especially since we span a new goroutine in here anyway where we call from a different thread then. I think I added this because I had same badfd/closed fd errors that were caused by a bug that closed the journal fd to early but I fixed that

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I’m not at all saying that it’s unnecessary; just that, reading the PR, I don’t know what to look for.)

Copy link
Member Author

@Luap99 Luap99 Nov 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I got that but I remembered why I added it and that was chasing a ghost that turned out to be a simple close to early when I was debugging locally.
I agree that it is confusing as it is not clear to readers why this is here so it should just be dropped. I will do so tomorrow.

@@ -106,6 +106,13 @@ func LogsFromContainer(w http.ResponseWriter, r *http.Request) {

w.WriteHeader(http.StatusOK)

flush := func() {
if flusher, ok := w.(http.Flusher); ok {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess so, but this is pattern used in many endpoints so I guess we could do such change in one solid PR to switch all callers over

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point, let’s defer that.

@baude
Copy link
Member

baude commented Nov 4, 2024

lgtm

@baude baude added the 5.3 label Nov 4, 2024
@rhatdan
Copy link
Member

rhatdan commented Nov 4, 2024

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Nov 4, 2024
@openshift-merge-bot openshift-merge-bot bot merged commit 0f25d9e into containers:main Nov 4, 2024
76 of 77 checks passed
@Luap99 Luap99 deleted the event-api-response branch November 5, 2024 10:28
Luap99 added a commit to Luap99/libpod that referenced this pull request Nov 5, 2024
This is not needed and was added by during debugging but it truned out
to be something else. We should not lock the thread unless needed
because this just raises question why it is here otherwise.
Also the lock would not do much as we spawn a goroutine below anyway so
it runs on another thread no matter what.

From the review comment by Miroslav but it was merged before I had the
chance to fix it:
containers#24406 (comment)

Signed-off-by: Paul Holzinger <[email protected]>
Luap99 added a commit to Luap99/libpod that referenced this pull request Nov 5, 2024
This is not needed and was added by during debugging but it turned out
to be something else. We should not lock the thread unless needed
because this just raises question why it is here otherwise.
Also the lock would not do much as we spawn a goroutine below anyway so
it runs on another thread no matter what.

From the review comment by Miloslav but it was merged before I had the
chance to fix it:
containers#24406 (comment)

Signed-off-by: Paul Holzinger <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5.3 approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/api-change Change to remote API; merits scrutiny lgtm Indicates that a PR is ready to be merged. release-note
Projects
None yet
5 participants